On the problem of boundaries and scaling for urban street networks

A. Paolo Masucci, Elsa Arcaute, Erez Hatna, Kiril Stanilov, and Michael Batty

(Dated: October 28, 2015) 

Urban morphology has presented significant intellectual challenges to mathematicians and
physicists ever since the eighteenth century, when Euler first explored the famous Königsberg
bridges problem. Many important regularities and scaling laws have been observed in urban
studies, including Zipf's law and Gibrat's law, rendering cities attractive systems for analysis
within statistical physics. Nevertheless, a broad consensus on how cities and their boundaries
are defined is still lacking. Applying an elementary clustering technique to the street
intersection space, we show that growth curves for the maximum cluster size of the largest
cities in the UK and in California collapse to a single curve, namely the logistic. Subsequently,
by introducing the concept of the condensation threshold, we show that natural boundaries of
cities can be well defined in a universal way. This allows us to study and discuss systematically
some of the regularities that are present in cities. We show that some scaling laws present
consistent behaviour in space and time, thus suggesting the presence of common principles at
the basis of the evolution of urban systems.


Since the middle of the twentieth century universal properties of cities have been identified,
including Zipf's and Gibrat's laws. City size has been measured most commonly in terms of
built area or population since Zipf's seminal book [1], notwithstanding that most of the time
city boundaries have been defined in terms of often arbitrary, fixed administrative boundaries.

Many different techniques to define cities have been suggested based on the analysis of urban
growth, and recently a method using demographic and commuting data has been proposed [6].
Clustering techniques such as the City Clustering Algorithm (CCA) have been applied, mostly
to analyse satellite images and demographic data [3-9], but these are rarely parameter free.
A method proposing a bottom-up approach that does not rely on highly aggregated census data
or on the interpretation of remotely sensed images is needed.

When we define a city we have to keep in mind that built area and population are strongly
correlated [9], but these correlations, as we show in this paper, do not necessarily carry
universal exponents. The interpretation of the empirical outcomes obtained using these
definitions therefore has to be put into context according to the methodology employed.

As pointed out in [6], a broad range of exponents based on different allometries inferred from
urban studies can be observed for different boundary definitions. This further supports the
urgent need for an operational and context-free definition of the city. It is somewhat
astonishing that in spite of the large body of literature about cities, the very concept of city
remains in some ways obscure, hidden or assumed.

In this paper we present some universal properties of cities which emerge when applying an
elemental clustering technique to the vertices and edges of street networks. We obtain a
logistic growth curve from which the structural fringe of the city can be defined mathematically
in a bottom-up approach. This is achieved by obtaining the parameters at the point at which a
condensation phenomenon is observed, as we will explain below. The curves for all cities then
collapse to a single curve, and city boundaries are hence defined in a universal way. Such
universality in the spatial properties of cities prompts us to look at the spatial and temporal
behaviour of important properties of urban street networks, and thus investigate whether some
scaling laws could display a general behaviour.


I. RESULTS 

A city is a complex organism, composed of many superimposed layers, such as transportation
networks, the built environment, and different economic, social, and information flows [12-14].
Such layers are dynamical by nature, and give rise to generic patterns, such as fractal
geometries [12, 13]. Administrative boundaries overlook these aspects, and are not able to
measure or record the dynamical aspects of cities in a consistent way across space.

Among others, street networks provide a good representation to characterise the morphology of
a city, where a street network is defined as the planar graph in which the street intersections
$N$ are the vertices and the street segments $E$ are the links. We consider here street
intersections as a good proxy for the urbanization process. Such a choice reduces the
complexity of the problem to that of a spatial point pattern, which has the value of simplicity.
Moreover, it has been positively tested before, and some correlations between the number of
street intersections and built area for urban systems are shown in the Supporting Information
(SI), Sec. I-D.

FIG. 1: Logistic growth for the maximum cluster size in a clustering process: the condensation
threshold. Top panel: maximum cluster size $N_{Max}(\tau)$ as a function of the threshold
$\tau$ for Greater London on a semi-log plot. The solid line is the logistic function fit of
Eq. 1. The dashed line represents the carrying capacity $C$, while the dotted line shows the
condensation threshold $\hat{\tau}$, defined as the threshold where $N_{Max}(\hat{\tau}) = C$.
Bottom panel: the maximum cluster (red) at the condensation threshold for London.

Considering a spatial window large enough to contain a given city and using an elementary
clustering technique, we consider two street intersections to belong to the same cluster if
they lie at a distance below a given distance threshold $\tau$, where $\tau$ is measured in
meters. Increasing $\tau$ enlarges the size of the clusters, until eventually a giant component
appears, which spans the entire street network.
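
The clustering rule described above can be sketched in a few lines. This is a minimal
illustration, not the authors' implementation: it assumes intersections are given as planar
(x, y) coordinates in meters, links every pair closer than the threshold with a brute-force
union-find pass, and returns the size of the largest cluster. The function name
`max_cluster_size` and the toy coordinates are hypothetical.

```python
from itertools import combinations
from math import hypot


def max_cluster_size(points, tau):
    """Size of the largest cluster when points closer than tau are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pass over all pairs; fine for a sketch, quadratic in the number of points.
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        if hypot(x1 - x2, y1 - y2) < tau:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


# Toy point pattern (meters): a dense core of four points plus two outliers.
toy_points = [(0, 0), (50, 0), (50, 50), (0, 50), (400, 0), (2000, 2000)]
growth_curve = [(tau, max_cluster_size(toy_points, tau)) for tau in (10, 60, 500, 3000)]
```

Sweeping the threshold as in `growth_curve` gives the maximum cluster size as a function of the
threshold. For real street networks a spatial index would be needed in place of the all-pairs
loop.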

We measure the maximum cluster size $N_{Max}(\tau)$ in terms of number of intersections as a
function of the increasing threshold $\tau$, and we find that for all the cities
$N_{Max}(\tau)$ first grows exponentially; eventually the growth slows down and the curve
condenses to a certain value (see top panel of Fig. 1). This behaviour has been positively
tested for all the largest cities in the UK and in California, suggesting that the maximum
cluster size behaviour for cities highlights universal properties of urban morphology (see SI
Sec. II for more details).


A. The condensation threshold 

The function defined by $N_{Max}(\tau)$, i.e., exponential growth followed by condensation, has
the characteristics of the logistic growth function:

$$N_{Max}(\tau) = \frac{C}{1 + e^{-r(\tau - \tau_0)}} \qquad (1)$$

where $C$ is the carrying capacity, $r$ is the growth rate and $\tau_0$ is the inflection
point [18].

Following Eq. 1, we show that for cities in the UK and in California, $N_{Max}(\tau)$ grows as
$e^{r\tau}$ until the inflection point $\tau_0$, and after that it condenses at a constant
value given by the carrying capacity $C$. In order to do that, given the transformation
$\{\tau^* = r(\tau - \tau_0),\ N^* = N_{Max}(\tau)/C\}$, we expect that all the measured curves
would collapse to a single curve, namely $N^*(\tau^*) = 1/(1 + e^{-\tau^*})$.
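
The collapse can be checked by substituting the rescaled variables into the logistic form of
Eq. 1. The short derivation below uses only the symbols already defined (carrying capacity $C$,
growth rate $r$, inflection point $\tau_0$) and also spells out the two limiting regimes
mentioned in the text.

```latex
\begin{align}
N^*(\tau^*) &= \frac{N_{Max}(\tau)}{C}
             = \frac{1}{1 + e^{-r(\tau - \tau_0)}}
             = \frac{1}{1 + e^{-\tau^*}}, \\
\tau \ll \tau_0:\quad N_{Max}(\tau) &\approx C\, e^{r(\tau - \tau_0)} \propto e^{r\tau}, \\
\tau \gg \tau_0:\quad N_{Max}(\tau) &\to C .
\end{align}
```

The rescaled curve no longer contains $C$, $r$ or $\tau_0$, which is why curves fitted with
different parameters are expected to fall on the same line.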

We test this hypothesis for the 61 largest cities in the UK and for the 52 largest cities in
California (see Fig. 2 and SI, Sec. I). The results are shown in Fig. 3 and we can see that in
both cases the quality of the collapse is very high ($R^2 > 0.99$). This quality is maintained
if the maximum cluster size is measured according to the number of street segments $E(\tau)$
instead of the number of intersections. In this case we find that the collapse is estimated
with an $R^2 > 0.98$.
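
One way to quantify the quality of such a collapse is sketched below. The paper does not spell
out how its $R^2$ is computed, so this is an assumption: the rescaled measurements are compared
with the logistic prediction along the identity line, using the usual coefficient of
determination. The function name and the synthetic data are hypothetical.

```python
from math import exp


def logistic_collapse_r2(taus, sizes, C, r, tau0):
    """R^2 of the rescaled sizes N/C against 1/(1 + exp(-tau*)), taken about the line y = x."""
    observed = [n / C for n in sizes]
    predicted = [1.0 / (1.0 + exp(-r * (t - tau0))) for t in taus]
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic example: a logistic curve with an alternating 2% perturbation (not real data).
C, r, tau0 = 1000.0, 0.05, 100.0
taus = list(range(0, 201, 20))
sizes = [C / (1.0 + exp(-r * (t - tau0))) * (1.02 if i % 2 else 0.98)
         for i, t in enumerate(taus)]
r2 = logistic_collapse_r2(taus, sizes, C, r, tau0)
```

In practice $C$, $r$ and $\tau_0$ would come from a fit of Eq. 1 to each city's measured curve;
here they are simply fixed by hand.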

These results indicate that the proposed clustering technique is able to capture generic
properties of urban street networks. In order to investigate this further, we look at how the
logistic form of Eq. 1 is related to urban morphology and whether it allows us to define in a
rigorous way the boundaries of a city.

As the logistic function is associated with the Verhulst model [18], it is interesting to
understand how the carrying capacity $C$, always referring to a reservoir in the system, could
be associated with our clustering approach. To understand this, we notice that the largest
cluster grows in the area where the intersection density is large, i.e., the urban area (see
Fig. 1 as a visual reference). The existence of a condensation phase shows that there exists an
abrupt transition between the urban area and the rural area, where the intersection density
consistently drops. Hence, the reservoir could be interpreted as the set of intersections
belonging to the urban network which are consumed while the maximum cluster grows, and then
the carrying capacity represents the city size in terms of street intersections.

Following the clustering analysis introduced above, when $\tau$ grows after the logistic
condensation phase, $N_{Max}(\tau)$ starts to grow again (see Fig. 1). This is because after
the maximum cluster reaches the condensation phase, as $\tau$ grows rural intersections and
small towns close by get absorbed by the maximum cluster. In such a way $N_{Max}(\tau)$ exceeds
the carrying capacity $C$.

FIG. 2: The UK and California datasets with land-use satellite comparison. Left panel: a large
portion of the Corine data set for the UK, a satellite-derived land-use map. Right panel: the
California satellite land-use map. The red parts are identified as urban areas, while the black
contours are the city condensation boundaries as defined in the text. Note that throughout this
paper, we refer to towns and cities as being in the UK when strictly we are excluding those in
Northern Ireland.

We define the city condensation threshold $\hat{\tau}$ as the threshold where the measured
maximum cluster size $N_{Max}(\tau)$ intersects the carrying capacity of the fitted logistic
function, i.e. $\hat{\tau} = \tau : N_{Max}(\tau) = C$. The city is then defined as the maximum
cluster at the city condensation threshold, as we show in Fig. 1 for London. In order to
investigate whether the city boundaries obtained in this way bear any resemblance to the
urbanised space, we overlay the given contours on land-use satellite images. Fig. 2
demonstrates clearly that the city boundaries as defined via the condensation threshold delimit
the so-called urban fringe, i.e. the spatial pattern related to the city's expansion.
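
As an illustration of this definition, the sketch below locates the first threshold at which a
measured growth curve reaches the fitted carrying capacity. It assumes the curve is sampled at
increasing thresholds and uses linear interpolation between the two bracketing samples; that
interpolation, the function name and the toy numbers are assumptions made for the example, not
details taken from the paper.

```python
def condensation_threshold(curve, C):
    """First threshold at which the measured maximum cluster size reaches C.

    curve: list of (tau, n_max) pairs sorted by increasing tau.
    Returns None if the measured curve never reaches the carrying capacity.
    """
    prev_tau, prev_n = curve[0]
    if prev_n >= C:
        return prev_tau
    for tau, n in curve[1:]:
        if n >= C:
            # Interpolate between the last sample below C and the first at or above C.
            return prev_tau + (C - prev_n) * (tau - prev_tau) / (n - prev_n)
        prev_tau, prev_n = tau, n
    return None


# Toy measured curve (threshold in meters, cluster size in intersections).
toy_curve = [(50, 100), (100, 600), (150, 950), (200, 1050), (250, 1400)]
tau_hat = condensation_threshold(toy_curve, C=1000)
```

The city would then be the largest cluster obtained by running the clustering at `tau_hat`.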


B. Space and time scaling relations 

In the following section, we try to understand the meaning of different allometries that are
usually found in urban studies and we examine them in spatial and temporal terms. To pursue
this, we analyse a few simple global statistical properties of the spatial networks: the
network total street length $L(N)$, measured in meters, which is the sum of the lengths of the
street segments for a given network; the network area $A(N)$, measured in square meters, which
is the area embedded by a given street network; and the street intersection density $P(n)$,
obtained by imposing a square grid with cells of 400 meter side on top of the street network,
and counting the number $n$ of intersections falling in each cell [28]. These quantities are
quite sensitive to the structure of the network and some of them have been considered in
different studies.
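
The grid-counting step behind $P(n)$ can be sketched as follows. This is only an illustration
under stated assumptions: coordinates are planar and in meters, the grid is aligned with the
origin, and only cells containing at least one intersection enter the distribution (the text
does not say how empty cells are treated). The function name is hypothetical.

```python
from collections import Counter


def intersection_density(points, cell=400.0):
    """Return P(n): the fraction of occupied grid cells that hold n intersections."""
    # Map each point to the index of the square cell it falls in.
    cell_counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    # Histogram of cell occupancies, normalised over occupied cells only.
    histogram = Counter(cell_counts.values())
    total = sum(histogram.values())
    return {n: count / total for n, count in sorted(histogram.items())}


# Toy intersections: three in one cell, one in a second cell, two in a third.
toy_intersections = [(10, 10), (20, 390), (399, 0), (410, 10), (800, 800), (810, 805)]
P = intersection_density(toy_intersections)
```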

The following analysis shows that urban street networks, as defined via the condensation
threshold, display statistical properties which are consistently different from the statistical
properties of rural street networks [29]. Moreover, we show that the allometric exponents
obtained for the above mentioned properties are compatible for cities in the UK and for cities
in California. Remarkably, we find that these exponents are compatible with the ones found for
the growth of London during the last two centuries.

FIG. 3: Growth curve collapse for the cities in the UK and in California. Panels a, c: rescaled
maximum cluster size $N^* = N_{Max}(\tau)/C$ as a function of the rescaled threshold
$\tau^* = r(\tau - \tau_0)$ for the largest 61 cities in the UK and for the largest 52 cities
in California. The dashed curve is $1/(1 + e^{-\tau^*})$. Panels b, d: in order to evaluate the
goodness of the collapse of the curves in panels a and c, we plot on the horizontal axis the
$N^*_{exp} = N_{Max}(\tau)/C$ values for the cities in the UK (panel b) and in California
(panel d) and on the vertical axis the value estimated via the logistic function,
$(1 + e^{-\tau^*})^{-1}$. We then calculate the $R^2$ value of the resulting points against the
dashed line $y = x$ and find $R^2 > 0.99$ for both UK and California cities. Panels e-h: the
same methodology as explained for panels a-d is applied to the number of street segments
$E(\tau)$ for the cities of the UK and California. In this case we find that the quality of the
logistic collapse is given by $R^2 > 0.99$ for the UK and $R^2 > 0.98$ for California.

The network total street length $L(N)$ is a global quantity characterizing the nature of the
underlying network. We can write $L = E\langle l\rangle$, where $E$ is the number of street
segments and $\langle l\rangle$ is the average length of a street segment, if such a quantity
can be well defined. Then, considering that the average degree of the network can be written as
$\langle k\rangle = 2E/N$, we have $L = \langle l\rangle\langle k\rangle N/2$, where the
distributions of both $l$ and $k$ have finite mean and variance.
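
The decomposition of the total street length into average segment length, average degree and
number of intersections can be verified numerically on any small graph. The toy network below
(a 100 m square block with one diagonal street) is hypothetical and serves only to show that
the identity holds by construction.

```python
from math import hypot

# Hypothetical toy street network: vertex id -> (x, y) in meters, edges as vertex pairs.
nodes = {0: (0.0, 0.0), 1: (100.0, 0.0), 2: (100.0, 100.0), 3: (0.0, 100.0)}
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

lengths = [hypot(nodes[a][0] - nodes[b][0], nodes[a][1] - nodes[b][1]) for a, b in edges]
N, E = len(nodes), len(edges)

total_length = sum(lengths)      # L: sum of street segment lengths
mean_length = total_length / E   # <l>: average segment length
mean_degree = 2 * E / N          # <k>: every segment contributes to two vertices

# L = <l> <k> N / 2 follows from substituting E = <k> N / 2 into L = E <l>.
reconstructed = mean_length * mean_degree * N / 2
```

If the average length and average degree do not depend on the number of intersections, the
total length is proportional to it, which is the linear regime discussed in the text.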


We find (see Fig. 4, panel a) that for cities in the UK the behaviour of $L(N)$ is consistent
with a linear function of $N$. On the other hand, for the rural street network in the UK, we
find a statistically significant different behaviour for the same quantity (p-value = 0.007),
which scales in a sub-linear way, i.e. $L(N) \propto N^{\beta}$ with $\beta < 1$. The linear
relation for $L$ in urban networks is due to the independence of $\langle k\rangle$ and
$\langle l\rangle$ from $N$, while the sub-linear relation for $L$ in the rural network is due
to the sub-linearity of $\langle l(N)\rangle$ for those networks (see SI, Sec. III).

In the case of cities in California (Fig. 4, panel b), we find that the behaviour of $L(N)$ is
consistent with that of the UK in a slightly super-linear regime, i.e., $L(N) \propto N^{1.04}$.
On the other hand, for the rural street network in California, we find that $L(N)$ is
sub-linear, i.e. $L(N) \propto N^{\beta}$ with $\beta < 1$, and it is consistent within the
error range neither with that of the California urban street network (p-value = 0.0003) nor
with that of the UK rural street network.
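
Allometric exponents of this kind are commonly estimated as the slope of a straight line in
log-log space. The sketch below shows that estimate with ordinary least squares; the paper does
not state which fitting procedure or error estimate it uses, so this is an assumed, simplified
stand-in, and the data are synthetic.

```python
from math import log


def allometric_exponent(ns, ys):
    """Least-squares slope of log(y) versus log(N), i.e. beta in y ~ N**beta."""
    xs = [log(n) for n in ns]
    zs = [log(y) for y in ys]
    mean_x = sum(xs) / len(xs)
    mean_z = sum(zs) / len(zs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    return sxz / sxx


# Synthetic check: lengths generated with a known exponent of 1.04 (not real data).
toy_n = [100, 1000, 10000, 100000]
toy_length = [80.0 * n ** 1.04 for n in toy_n]
beta = allometric_exponent(toy_n, toy_length)
```

Values of `beta` below, at, or above 1 correspond to the sub-linear, linear and super-linear
regimes discussed in the text.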

In panels c and d of Fig. 4 we see that the exponents for urban network areas $A(N)$ in the UK
and in California are quite similar, following a very mild super-linear relation, i.e.
$A(N) \propto N^{\beta}$ with $\beta$ slightly above 1. On the other hand, super-linearity can
be statistically discarded for both exponents for the rural case in the UK
(p-value = 0.000004) and in California (p-value = 0.0004). In addition, it is important to note
that for the rural networks, the exponents for the UK and California are notably different.
Linearity can be discarded for California, while this is not the case for the UK.

These differences reflect the contrast in the spatial patterns of the street networks covering
these two countries. In particular, the nearly linear relations found for the urban areas
reflect the fact that street intersections are generally homogeneously distributed within the
urban fringes. Such homogeneity can be seen from the street intersection distributions $P(n)$
shown in panels e and f of the same figure. In this case again we find very similar patterns
between the UK and California, where $P(n)$ is well fitted by a logistic distribution in the
case of urban street networks. This is a bell-shaped distribution with a well defined average
and variance, while it is ill defined for rural street networks.

The analysis above highlights the fact that urban street networks are characterized by an
overall homogeneous texture, which is consistent between the two different countries considered
in this work. In the same way, we can observe how rural street networks differ consistently
from urban street networks and between different countries, displaying an overall inhomogeneous
structure. Hence, we find that for urban conglomerations a general behaviour emerges in the
study of the scaling laws which characterize the global street network structure.

FIG. 4: Statistical properties of urban networks. Panels a, b: total length of the street
network as a function of the number of vertices, $L(N)$, for the UK and California. Panels c, d:
total area of the street networks as a function of the number of intersections, $A(N)$, for the
UK and California. Panels e, f: density distribution $P(n)$ for the number of intersections
contained in the cells of a square grid lattice of 400 m side for the UK and California.

These hints of universal behaviours do not imply that different cities look the same. In fact,
different urbanization processes shape cities in very different ways, in terms of morphology
and size. Nevertheless, the compatibility between the exponents for the analysed quantities
suggests that there might be common principles for the growth of cities. If this is the case,
then cities at a specific point in time represent different states of the evolutionary process.
We would then expect to find a similar behaviour if we looked at the evolution in time of a
specific city. In order to test this hypothesis, we consider a unique dataset recording the
evolution of street networks of Greater London between 1786 and 2010, through nine well spaced
temporal intervals defined by the maps shown in Fig. 5 (see SI Sec. I-C for more info).

In Fig. 6 we perform a simple test, by measuring the aforementioned quantities in the
contemporary UK urban street networks and in the historical London dataset. Interestingly
enough, for $L(N)$ in the UK, the historical dataset overlaps with the spatial dataset and both
allometric fittings are consistent with a linear regime. As we stated above, this means an
overall homogeneity in terms of the average connectivity $\langle k\rangle$ and the average
street segment length $\langle l\rangle$, that is preserved over time. For $A(N)$, even if the
points do not really overlap, the allometric behaviour is consistent between the time and space
averages in the slightly super-linear regime.


II. DISCUSSION 

Two important results can be derived from our study. On the one hand, we provided a methodology
to define city boundaries through spatial urban networks in a universal way. On the other, we
explored the generality of some scaling laws related to urban street networks. Both of these
aspects relate to the quest for methodological advancements in the analysis of spatial urban
networks, and they relate to the discussion of important statistical phenomena, such as those
described by Zipf's and Gibrat's laws.

Regarding the concept of city boundaries, we discovered universal properties of street networks
related to clustering properties in the street intersection space. These properties allow us to
distinguish the urban agglomerate with a methodology that is parameter free and that reduces
the problem of extracting city boundaries to a simple clustering process on a spatial point
pattern.

The concept of city boundaries is very important to distinguish between urban and rural
networks. We show that allometries found in urban street networks consistently differ from the
ones found in rural street networks. This means that an ill-posed definition of boundaries,
such as arbitrary administrative boundaries, would mix the properties of street networks that
are in two distinct phases of their evolution, producing spurious results (see SI Sec. III for
a direct example).

FIG. 5: Historical London. Street intersections of the city cores of London as defined by the
condensation threshold from 1786 to 2010.

FIG. 6: Space-time analysis. Left panel: total length of the street network $L(N)$ for the
historical London dataset compared to the contemporary UK urban street networks. Right panel:
area of the street network $A(N)$ for the historical London dataset compared to the
contemporary UK urban street networks.

Regarding our analysis of the generality in space and time of relevant allometries found in
urban street networks, we chose two very distinct datasets that present different urbanisation
paths. While cities in the UK are mostly of Roman or Medieval origin and reflect a long line of
urban evolution spanning two millennia, cities in California are mostly the result of an urban
explosion during the latter half of the nineteenth and the twentieth centuries. In this
context, we find that urban street networks display compatible properties, even though the
datasets are very different. This highlights how the city is an overall homogeneous structure
in terms of its street network quantities (average degree, average street length, etc.). These
findings are confirmed by our analysis, which compares the structure of the urban street
networks in the UK with the street networks of the historical evolution of London during more
than two centuries. Even if these results are not definitive, a general behaviour for the found
exponents cannot be excluded at this point and new perspectives of research in this direction
are thus opened.

Spatial networks are widespread in nature and it is possible to see how the organization of
spatially embedded structures is often similar for a variety of different phenomena. Leaf
venation, crack pattern formation, river networks, ant galleries, circulatory systems, soap
froths, pipe networks and so on have been studied in a wide range of disciplines which are
often strongly related. In particular, brain networks seem to share a number of similarities
with the organization of spatial street networks, due to their high modularity and fractal
structure.

Even though cities present a diverse range of morphological features, we have shown that the
boundaries of cities can be identified through universal properties of street networks. This
opens up new research perspectives in terms of the analysis of the logistic parameters for each
city. As cities undergo different stages of evolution, related either to expansion or to
condensation phases, those different evolution phases could be easily recognised and classified
from the deviations in the logistic curve related to the clustering process (see SI,
Sec. II-A).

Moreover, from our analysis we can derive a broad picture of the way a city evolves. What we
observe is that the street network can be found in two very distinct phases, the rural one,
which is not characterized by any distinctive properties, and the urban one, which is
characterized by a high density of intersections which are distributed in patterns that are
mostly homogeneous and which carry very similar exponents. In such a picture a city street
network develops as an articulated organism territorializing the sparse rural street network,
filling the space with denser residential patterns and then radically changing its morphology.

A key advantage of our method of analysis, compared to other existing approaches, such as those
based on data extracted from satellite imagery, is the ease of use. Recent advances in GIS
technologies have led to the proliferation of street network data generated by public and
private entities. Our study demonstrates that these datasets can be deployed in new ways to
analyse key properties of cities, enhancing our ability to manage the built environment. A
disadvantage of our methodology, as it is presented in this form, derives from its bottom-up
approach. As a matter of fact, it is best suited to extracting a limited number of cities, as
the extraction procedure cannot be completely automated and needs visual inspection (see SI
Sec. II-A). In order to extract a large number of cities, top-down techniques are definitely
more efficient, even if less precise.